Foreword 



This Technical Specification has been produced by the 3rd Generation Partnership Project (3GPP). 

The contents of the present document are subject to continuing work within the TSG and may change following formal 
TSG approval. Should the TSG modify the contents of the present document, it will be re-released by the TSG with an 
identifying change of release date and an increase in version number as follows: 

Version x.y.z 

where: 

x the first digit: 

1 presented to TSG for information; 

2 presented to TSG for approval; 

3 or greater indicates TSG approved document under change control. 

y the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, 
updates, etc. 

z the third digit is incremented when editorial only changes have been incorporated in the document. 
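
The version scheme above can be illustrated with a short sketch. The function name `describe_version` and the wording of the returned strings are hypothetical; only the meaning of the three digits comes from the text.

```python
def describe_version(version: str) -> dict:
    """Interpret an 'x.y.z' version string using the rules listed above."""
    x, y, z = (int(part) for part in version.split("."))
    if x == 1:
        status = "presented to TSG for information"
    elif x == 2:
        status = "presented to TSG for approval"
    elif x >= 3:
        status = "TSG approved document under change control"
    else:
        # The text defines no meaning for a first digit below 1.
        status = "undefined"
    return {
        "status": status,
        "substantive_changes": y,  # second digit: changes of substance
        "editorial_changes": z,    # third digit: editorial-only changes
    }
```

For example, `describe_version("7.2.0")` reports an approved document with a second digit of 2 and a third digit of 0.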



Introduction 



The 3GPP Technical Specifications TS 22.340 [55] and TS 22.141 [56] define the requirements for the 3GPP IP 
Multimedia Subsystem (IMS) based messaging and presence services. This Technical Specification takes the 
requirements into account when defining the minimal baseline and optional media codecs and message container format 
to be used by IMS Messaging and associated Presence service, when supported. 

IMS Messaging services incorporate one or more of the following messaging types: Immediate messaging, Deferred 
delivery messaging, and Session based messaging. With Immediate messaging the sender expects immediate message 
delivery in what is perceived as real time compared with Deferred messaging where the sender expects the network to 
deliver the message as soon as the recipient becomes available. With Session based messaging a communications 
association is established between two or more users before communication can take place. In the simplest form Session 
based messaging may be a direct communication between two users. This specification defines the media types and 
container formats for both the Immediate message type and the Session based message type. 

The specification provides an interoperable baseline set of media types for messaging and presence 
services, while maximising technology re-use from existing 3GPP services whose media 
types are defined in TS 26.140 [13] and TS 26.234 [14]. The specification also provides the ability to 
indicate to the IMS system the complete set of UE media and storage capabilities relevant for the IMS messaging 
and presence service. 

For IMS terminals capable of Combined CS and IMS (CSI) operation [59] [60], the specification provides an Annex 
with guidelines on how to combine IMS media with CS calls. 

Scope 



The present document specifies the basic media formats and codecs to be used in the IMS Messaging and Presence 
services, including CSI. It defines the mandatory "baseline" set of media types for the services. It also 
aims to allow possible message content type enhancements, either 3GPP-standardized or other generally used media 
types, in a flexible way. 

3 Definitions, symbols and abbreviations 

3.1 Definitions 

Deferred delivery messaging: A type of IMS Messaging service by which the sender expects the network to deliver 
the message as soon as the recipient becomes available. 

Immediate messaging: A type of IMS Messaging service by which the sender expects immediate message delivery in 
(near) real time fashion. 

IMS Messaging services: A group of services, supported by capabilities of the 3GPP IP Multimedia Subsystem (3GPP 
TS 22.228 [54]), that allows an IMS user to send and receive messages to other users. IMS messaging services comprise 
one or more types: Immediate messaging, Session based messaging and Deferred delivery messaging. 

Session based messaging: A type of IMS Messaging service by which the sender expects immediate message delivery 
in (near) real time fashion. In addition the sender(s) and the receiver(s) have to join a messaging session, e.g. a chat 
room, before message exchange can take place. 

continuous media: media with an inherent notion of time, in the present document speech, audio, synthetic audio and 
video. 

static media: media that itself does not contain an element of time, in the present document all media not defined as 
continuous media. 

scene description: description of the spatial layout and temporal behaviour of a presentation, it can also contain 
hyperlinks. 

3.2 Abbreviations 

3GP 3GPP file format 

AAC Advanced Audio Coding 

AMR Adaptive Multi-rate Codec 

AVC Advanced Video Coding 

CC/PP Composite Capability/Preference Profiles 

CSI Combination of CS and IMS services 

DLS Downloadable Sounds 

Enhanced aacPlus MPEG-4 High Efficiency AAC plus MPEG-4 Parametric Stereo 

EXIF Exchangeable image file format 

GIF Graphics Interchange Format 

H.263 ITU-T video codec 

IP Internet Protocol 

IMS IP Multimedia Subsystem 

ITU-T International Telecommunications Union - Telecommunications 

JFIF JPEG File Interchange Format 

JPEG Joint Picture Expert Group 

MIDI Musical Instrument Digital Interface 

MIME Multipurpose Internet Mail Extensions 

MM Multimedia Message 

MMS Multimedia Messaging Service 

MPEG Motion Picture Expert Group 

MP4 MPEG-4 file format 

PSS Packet-switched Streaming Service 

SBR Spectral Band Replication 

SP-MIDI Scalable Polyphony MIDI 

SVG Scalable Vector Graphics 

UTF-8 Unicode Transformation Format (the 8-bit form) 

XMF Extensible Music Format 



4 Formats for Static Media 



Multiple media elements shall be combined into a composite single IMS message using MIME multipart content type 
format as defined in RFC 2046 [25]. The media type of a single IMS message element shall be identified by its 
appropriate MIME type whereas the media format shall be indicated by its appropriate MIME subtype. 
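
As a minimal sketch of the multipart rule above, the following uses Python's standard `email` package to combine a text element and a still image into one composite message. The function name, the file name `photo.jpg`, and the placeholder image bytes are illustrative assumptions, and the resulting `multipart/mixed` container is simply what the library chooses by default, not a requirement taken from this specification.

```python
from email.message import EmailMessage


def build_composite_message(text: str, jpeg_bytes: bytes) -> EmailMessage:
    """Combine a text element and a JPEG element into one multipart message."""
    msg = EmailMessage()
    # First element: MIME type "text", subtype "plain".
    msg.set_content(text, subtype="plain", charset="utf-8")
    # Second element: MIME type "image", subtype "jpeg".
    # Adding an attachment turns the message into a multipart container.
    msg.add_attachment(
        jpeg_bytes, maintype="image", subtype="jpeg", filename="photo.jpg"
    )
    return msg


# Placeholder payload (SOI + EOI markers only), not a displayable image.
message = build_composite_message("Hello", b"\xff\xd8\xff\xd9")
part_types = [part.get_content_type() for part in message.iter_parts()]
```

Each element keeps its own type and subtype, so a receiver can decide per element whether it supports the format.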

In order to guarantee a minimum support and compatibility between IMS Messaging and Presence Service capable 
terminals and OMA IMPS 1.1 capable terminals, IMS Messaging User Agent and IMS Presence User Agent supporting 
specific media types shall comply with the following selection of media formats: 

4.1 Text 

Plain text. Any character encoding (charset) that contains a subset of the logical characters in Unicode [2] shall be used 
(e.g. US-ASCII [3], ISO-8859-1 [4], UTF-8 [5], Shift_JIS, etc.). 

Unrecognized subtypes of "text" shall be treated as subtype "plain" as long as the MIME implementation knows how to 
handle the charset. Any other unrecognized subtype and unrecognized charset shall be treated as 
"application/octet-stream". 

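The fallback rule for text can be sketched as follows. This is an illustration under stated assumptions: the set `KNOWN_TEXT_SUBTYPES` is hypothetical, and "knows how to handle the charset" is approximated by asking Python's `codecs` registry whether it has a codec of that name.

```python
import codecs

# Assumption: the subtypes this hypothetical implementation recognizes.
KNOWN_TEXT_SUBTYPES = {"plain"}


def charset_is_known(charset: str) -> bool:
    """Approximate 'the implementation can handle the charset'."""
    try:
        codecs.lookup(charset)
        return True
    except LookupError:
        return False


def effective_text_type(subtype: str, charset: str) -> str:
    """Return the MIME type a 'text/<subtype>' element is treated as."""
    if not charset_is_known(charset):
        # Unrecognized charset: treat as opaque binary data.
        return "application/octet-stream"
    if subtype.lower() in KNOWN_TEXT_SUBTYPES:
        return f"text/{subtype.lower()}"
    # Unrecognized text subtype with a known charset: treat as plain text.
    return "text/plain"
```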


4.2 Still Image 



For IMS terminals supporting still images, ISO/IEC JPEG [8] together with JFIF [9] shall be supported. The support for 
ISO/IEC JPEG applies only to the following two modes: 

mandatory: baseline DCT, non-differential, Huffman coding, as defined in table B.1, symbol 'SOF0' in [8]; 

optional: progressive DCT, non-differential, Huffman coding, as defined in table B.1, symbol 'SOF2' in [8]. 

For JPEG baseline DCT, EXIF compressed image file format should also be supported, as defined in [58]. In that case 
there is no requirement for the MMS Messaging and Presence client to interpret or present the EXIF parameters 
recorded in the file. 
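
The two JPEG modes are distinguished by the start-of-frame marker in the file: SOF0 is marker code 0xFFC0 and SOF2 is 0xFFC2. The sketch below walks the marker segments of a JPEG stream and reports which mode it uses. The function names and the returned labels are hypothetical, and the parser is deliberately minimal (it stops at the first frame header and does no validation beyond marker structure).

```python
import struct

# Markers without a length field: RST0..RST7 and TEM.
STANDALONE_MARKERS = set(range(0xD0, 0xD8)) | {0x01}
# 0xC4 (DHT), 0xC8 (JPG) and 0xCC (DAC) sit in the SOF range but are not SOFn.
NON_SOF_IN_RANGE = {0xC4, 0xC8, 0xCC}


def first_sof_marker(data: bytes):
    """Return the code of the first SOFn marker, or None if none is found."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker, not a JPEG stream")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker
            pos += 1
            continue
        if marker in STANDALONE_MARKERS:
            pos += 2
            continue
        if 0xC0 <= marker <= 0xCF and marker not in NON_SOF_IN_RANGE:
            return marker
        if marker == 0xDA:  # start of scan reached without a frame header
            return None
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        pos += 2 + length  # length counts its own two bytes
    return None


def jpeg_support_class(data: bytes) -> str:
    """Map the frame type onto the two modes listed above."""
    marker = first_sof_marker(data)
    if marker == 0xC0:
        return "mandatory"  # baseline DCT (SOF0)
    if marker == 0xC2:
        return "optional"   # progressive DCT (SOF2)
    return "not covered"
```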

4.3 Bitmap Graphics 

For IMS terminals supporting bitmap graphics, the following bitmap graphics formats should be supported: 

- GIF87a [15]; 

- GIF89a [16]; 

- PNG [17]. 
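
The three bitmap formats can be told apart by their leading signature bytes. The following is a small sketch of that check; the function name `bitmap_format` is hypothetical.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def bitmap_format(data: bytes):
    """Identify GIF87a, GIF89a or PNG content from its signature bytes."""
    if data.startswith(b"GIF87a"):
        return "GIF87a"
    if data.startswith(b"GIF89a"):
        return "GIF89a"
    if data.startswith(PNG_SIGNATURE):
        return "PNG"
    return None  # none of the listed bitmap formats
```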



5 Formats for Continuous Media 



In order to guarantee a minimum support and compatibility between IMS Messaging and Presence Service capable 
terminals and MMS capable terminals that offer support of continuous media formats (section 5) and media 
synchronisation and scene description (see section 6), IMS Messaging User Agent and IMS Presence User Agent 
supporting specific media types should in addition to formats listed in section 4 of this document comply with the 
following selection of media formats: 

5.1 Speech 

For IMS terminals supporting speech, the AMR codec shall be supported for narrow-band speech [26] [40] [41] [42]. 

The AMR wideband speech codec [27] [43] [44] [45] shall be supported when wideband speech working at 16 kHz 
sampling frequency is supported. 

When using speech media type alone, AMR or AMR-WB data stored according to the file format specified in [32] 
should be supported. The mandatory format is defined in clause 5.4. 

Multi-channel sessions shall not be used. 
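
As an illustration only, the sketch below classifies stored AMR data by its file header. It assumes that the file format referenced as [32] is the IETF AMR and AMR-WB storage format, whose files begin with the magic strings shown; that mapping and the function name are assumptions, not statements from this specification.

```python
# Magic strings of the IETF AMR / AMR-WB storage format (assumed to be [32]).
# Each entry maps a header to (codec, is_multi_channel).
AMR_MAGIC = {
    b"#!AMR\n": ("AMR", False),
    b"#!AMR-WB\n": ("AMR-WB", False),
    b"#!AMR_MC1.0\n": ("AMR", True),
    b"#!AMR-WB_MC1.0\n": ("AMR-WB", True),
}


def classify_amr_file(data: bytes):
    """Return (codec, usable) for stored AMR data, or None if not AMR."""
    for magic, (codec, multi_channel) in AMR_MAGIC.items():
        if data.startswith(magic):
            # Multi-channel content is excluded by the rule above.
            return codec, not multi_channel
    return None
```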

5.2 Audio 

For IMS terminals supporting audio, one or both of the following two audio codecs should be supported: 

- Enhanced aacPlus [49] [50] [51] 

- Extended AMR-WB [46] [47] [45] 

There is no requirement that a terminal supporting decoding by one of the codecs shall also support encoding by that 
codec. 

Specifically, based on the audio codec selection test results Extended AMR-WB is strong for the scenarios marked with 
blue. Enhanced aacPlus is strong for the scenarios marked with orange, and both are strong for the scenarios marked 
with green colour in the table below: 



Table: audio codec selection test results by content type (Music, Speech over Music, Speech between Music, Speech) 
and bit rate (14 kbps mono, 18 kbps stereo, 24 kbps stereo, 24 kbps mono, 32 kbps stereo, 48 kbps stereo). The cell 
colouring is not reproduced here. 



More recent information on the performance of the codecs based on more recent versions of the codecs can be found in 
TR 26.936 [62]. 

Enhanced aacPlus decoder is also able to decode MPEG-4 AAC LC content. 

Extended AMR-WB decoder is also able to decode AMR-WB content. 

In addition, MPEG-4 AAC Low Complexity and MPEG-4 AAC Long Term Prediction object types [19] may be 
supported. The maximum sampling rate to be supported by the decoder is 48 kHz. The channel configurations to be 
supported are mono (1/0) and stereo (2/0). 



5.3 Video 



For IMS terminals supporting video, ITU-T Recommendation H.263 [10] [11] profile 0 level 45 shall be supported. In 
addition: 

- H.263 Profile 3 Level 45 [10] [11]; 

- MPEG-4 Visual Simple Profile Level 0b [12]; 

- H.264 (AVC) Baseline Profile Level 1b [52] [53] with constraint_set1_flag=1; 

should be supported. There are no requirements on output timing conformance of H.264 (AVC) decoding (Annex C of 
[52]). 

An optional video buffer model is given in Annex G of document [14]. It shall not be used with H.264 (AVC). 

NOTE: ITU-T Recommendation H.263 profile 0 has been mandated to ensure that a video-enabled IMS Messaging 
& Presence user agent supports a minimum baseline video capability. Both H.263 and MPEG-4 Visual 
decoders can decode an H.263 profile 0 bit stream. It is strongly recommended, though, that an H.263 
profile 0 bit stream is transported and stored as H.263 and not as MPEG-4 Visual (short header), as 
MPEG-4 Visual is not mandated by IMS Messaging & Presence services. 

5.4 File Format for video and associated speech/audio media types 

To ensure interoperability for the transport of video and associated speech/audio in an IMS Messaging and Presence 
client, the 3GPP file format with Basic profile shall be supported. 

The usage of the 3GPP file format shall follow the technical specifications and the implementation guidelines specified 
in TS 26.244 [33]. 



5.5 Synthetic audio 



For IMS terminals supporting synthetic audio, the Scalable Polyphony MIDI (SP-MIDI) content format defined in 
Scalable Polyphony MIDI Specification [28] and the device requirements defined in Scalable Polyphony MIDI Device 
5-to-24 Note Profile for 3GPP [29] should be supported. 

SP-MIDI content is delivered in the structure specified in Standard MIDI Files 1.0 [31], either in format 0 or format 1. 
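
A Standard MIDI File declares its format in the header chunk, so the delivery structure can be checked before playback. The sketch below reads the `MThd` chunk and, assuming the two permitted formats are 0 and 1, reports whether the file qualifies. The function names are hypothetical.

```python
import struct


def smf_format(data: bytes) -> int:
    """Return the format field from a Standard MIDI File header chunk."""
    if len(data) < 14 or data[:4] != b"MThd":
        raise ValueError("missing MThd header chunk")
    # Big-endian: chunk length (32 bit), format, track count, division.
    length, fmt, _ntrks, _division = struct.unpack(">IHHH", data[4:14])
    if length < 6:
        raise ValueError("MThd chunk too short")
    return fmt


def is_allowed_smf(data: bytes) -> bool:
    """Assumption: only SMF format 0 and format 1 are acceptable."""
    return smf_format(data) in (0, 1)
```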

In addition the Mobile DLS instrument format defined in [38] and the Mobile XMF content format defined in [39] 
should be supported. 

An MMS client supporting Mobile DLS shall meet the minimum device requirements defined in [38] in section 1.3 and 
the requirements for the common part of the synthesizer voice as defined in [38] in section 1.2.1.2. If Mobile DLS is 
supported, wavetables encoded with the G.711 A-law codec (wFormatTag value 0x0006, as defined in [38]) shall also 
be supported. The optional group of processing blocks as defined in [38] may be supported. Mobile DLS resources are 
delivered either in the file format defined in [38], or within Mobile XMF as defined in [39]. For Mobile DLS files 
delivered outside of Mobile XMF, the loading application should unload Mobile DLS instruments so that the sound 
bank required by the SP-MIDI profile [29] is not persistently altered by temporary loadings of Mobile DLS files. 

Content that pairs Mobile DLS and SP-MIDI resources is delivered in the structure specified in Mobile XMF [39]. As 
defined in [39], a Mobile XMF file shall contain one SP-MIDI SMF file and no more than one Mobile DLS file. MMS 
clients supporting Mobile XMF must not support any other resource types in the Mobile XMF file. Media handling 
behaviours for the SP-MIDI SMF and Mobile DLS resources contained within Mobile XMF are defined in [39]. 



5.6 Vector graphics 



For IMS terminals supporting 2D vector graphics, the Scalable Vector Graphics (SVG) Tiny 1.2 format [20] [21] and 
ECMAScript [54] shall be supported. 

NOTE 1: The compression format for SVG content is GZIP [35], in accordance with the SVG specification [20]. 

NOTE 2: Only media formats supported by IMS Messaging and Presence, as specified in clauses 4 and 5 of this 
specification, shall be used. MMS Messaging and Presence clients do not support the Ogg Vorbis format. 

NOTE 3: Content creators of SVG Tiny 1.2 for IMS Messaging and Presence clients are strongly recommended to 
follow the content creation guidelines provided for PSS clients in Annex L of [14]. 

NOTE 4: If SVG Tiny 1.2 will not be published within a reasonable timeframe, the decision to adopt SVG Tiny 1.2 
in favour of SVG Tiny 1.1 may be reconsidered. 
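
Since SVG content may arrive GZIP-compressed (NOTE 1), a receiver has to detect compression before parsing. A minimal sketch follows, using the two-byte GZIP magic number 0x1F 0x8B. The function name and the choice of UTF-8 for decoding are assumptions made for the example.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"


def load_svg_text(data: bytes) -> str:
    """Return SVG markup, transparently decompressing GZIP content."""
    if data[:2] == GZIP_MAGIC:
        data = gzip.decompress(data)
    # Assumption: the document is UTF-8 encoded.
    return data.decode("utf-8")


svg_source = '<svg xmlns="http://www.w3.org/2000/svg" version="1.2" baseProfile="tiny"/>'
compressed = gzip.compress(svg_source.encode("utf-8"))
```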



6 Media synchronisation and presentation format 

The 3GPP IMS Messaging and Presence uses a subset of SMIL 2.0 [24] for media synchronisation and scene 
description. IMS clients and servers with support for media synchronization and scene descriptions shall support the 
3GPP SMIL Language Profile defined in [34]. 

This profile is a subset of the SMIL 2.0 Language Profile but a superset of the SMIL 2.0 Basic Language Profile. 
Document [34] also includes an informative annex A that provides guidelines for SMIL content authors. 

Additionally, XHTML Mobile Profile [30] for scene description should be supported. IMS clients and servers with 
support for scene descriptions based on XHTML shall support XHTML Mobile Profile [30], defined by the WAP 
Forum. 

- XHTML Mobile Profile is a subset of XHTML 1.1 but a superset of XHTML Basic. 

Annex A (informative): 
CSI Handling 

A.1 Introduction 

The Combination of CS and IMS services (CSI) is an operation mode combining circuit switched calls and IMS services, 
where the UE presents the CS and IMS services within one context to the user [59] [60]. However, the capability to 
simultaneously render certain media types of a CS call and IMS session may be limited by a UE and capability 
exchange alone may not be enough to resolve such conflicts. For instance: 

- During a CS speech call, a UE may not be able to render additional speech accompanying a video clip in an IMS 
session. This limitation is not clear if the UE has indicated that it is capable of receiving video clips. 

- During a CS multimedia call, a UE may not be able to display both video from the CS call and images from the 
IMS session. Although the UE is not capable of fully rendering images and video simultaneously, it may be possible 
to view images in front of video. 

The above conflicts are resolved by applying default rules specified in [59]. This Annex describes the UE behaviour for 
a number of scenarios drawn from the rules in [59]. This list may be extended in future versions of this specification. 

Note that the IMS media types and formats applicable to CSI are specified in: 

- clauses 6 and 9 of reference [61] for streamed media; 

- clauses 4 and 5 of the present document for media delivered in messages. 



A.2 Sharing personal content during CS voice call 

In a person-2-person communication, participants can combine a CS voice call with an IMS session and share content 
such as still images and video. In particular participants may share media content that is (or has been) created by the 
participants in the session. 

TS 22.279 [59] defines that if media, or parts thereof, accepted by a user cannot be rendered by the UE simultaneously 
with the CS call, conflicts shall be resolved such that the user is presented with CS speech with preference over IMS 
speech/audio. 



A.3 Sharing personal content during CS multimedia call 

In a person-2-person communication, participants can combine a CS multimedia call (3G-324M) with an IMS session 
and share content such as still images. In particular participants may share media content that is (or has been) created by 
the participants in the session. 

TS 22.279 [59] defines that if media, or parts thereof, accepted by a user cannot be rendered by the UE simultaneously 
with the CS call, conflicts shall be resolved such that the user is presented with: 

- CS speech with preference over IMS speech/audio; 

- IMS video and images with preference over CS video. 
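
The default rules quoted in A.2 and A.3 can be summarised in a short sketch. It covers only the case where the UE cannot render the conflicting media simultaneously; the function name, the media labels, and the shape of the result are hypothetical choices for illustration.

```python
def resolve_csi_conflicts(cs_media: set, ims_media: set) -> dict:
    """Pick the source presented to the user for each conflicting media kind.

    Assumes the UE cannot render the CS and IMS media simultaneously.
    """
    presented = {}

    # Audio path: CS speech is preferred over IMS speech/audio.
    if "speech" in cs_media:
        presented["audio"] = "CS"
    elif ims_media & {"speech", "audio"}:
        presented["audio"] = "IMS"

    # Visual path: IMS video and images are preferred over CS video.
    if ims_media & {"video", "image"}:
        presented["visual"] = "IMS"
    elif "video" in cs_media:
        presented["visual"] = "CS"

    return presented


# CS multimedia call (speech + video) combined with an IMS session sharing an image.
result = resolve_csi_conflicts({"speech", "video"}, {"image", "audio"})
```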